positive rate true positive rate
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Italy > Veneto > Venice (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Information Technology > Security & Privacy (1.00)
- Government (0.67)
- North America > United States > Minnesota (0.04)
- North America > Dominican Republic (0.04)
- Asia > Nepal (0.04)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
- Europe > Germany > Saarland > Saarbrücken (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > Virginia (0.05)
- Europe > United Kingdom > North Sea > Southern North Sea (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Node-level Contrastive Unlearning on Graph Neural Networks
Lee, Hong kyu, Zhang, Qiuchen, Yang, Carl, Xiong, Li
Graph unlearning aims to remove a subset of graph entities (i.e. nodes and edges) from a graph neural network (GNN) trained on the graph. Unlike machine unlearning for models trained on Euclidean-structured data, effectively unlearning a model trained on non-Euclidean-structured data, such as graphs, is challenging because graph entities exhibit mutual dependencies. Existing works utilize graph partitioning, influence function, or additional layers to achieve graph unlearning. However, none of them can achieve high scalability and effectiveness without additional constraints. In this paper, we achieve more effective graph unlearning by utilizing the embedding space. The primary training objective of a GNN is to generate proper embeddings for each node that encapsulates both structural information and node feature representations. Thus, directly optimizing the embedding space can effectively remove the target nodes' information from the model. Based on this intuition, we propose node-level contrastive unlearning (Node-CUL). It removes the influence of the target nodes (unlearning nodes) by contrasting the embeddings of remaining nodes and neighbors of unlearning nodes. Through iterative updates, the embeddings of unlearning nodes gradually become similar to those of unseen nodes, effectively removing the learned information without directly incorporating unseen data. In addition, we introduce a neighborhood reconstruction method that optimizes the embeddings of the neighbors in order to remove influence of unlearning nodes to maintain the utility of the GNN model. Experiments on various graph data and models show that our Node-CUL achieves the best unlearn efficacy and enhanced model utility with requiring comparable computing resources with existing frameworks.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Austria > Vienna (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- (5 more...)
Riddle Me This! Stealthy Membership Inference for Retrieval-Augmented Generation
Naseh, Ali, Peng, Yuefeng, Suri, Anshuman, Chaudhari, Harsh, Oprea, Alina, Houmansadr, Amir
Retrieval-Augmented Generation (RAG) enables Large Language Models (LLMs) to generate grounded responses by leveraging external knowledge databases without altering model parameters. Although the absence of weight tuning prevents leakage via model parameters, it introduces the risk of inference adversaries exploiting retrieved documents in the model's context. Existing methods for membership inference and data extraction often rely on jailbreaking or carefully crafted unnatural queries, which can be easily detected or thwarted with query rewriting techniques common in RAG systems. In this work, we present Interrogation Attack (IA), a membership inference technique targeting documents in the RAG datastore. By crafting natural-text queries that are answerable only with the target document's presence, our approach demonstrates successful inference with just 30 queries while remaining stealthy; straightforward detectors identify adversarial prompts from existing methods up to ~76x more frequently than those generated by our attack. We observe a 2x improvement in TPR@1%FPR over prior inference attacks across diverse RAG configurations, all while costing less than $0.02 per document inference.
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- Europe > Germany > North Rhine-Westphalia > Düsseldorf Region > Düsseldorf (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area (0.93)
- Energy > Renewable (0.92)
- Energy > Power Industry (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Autonomous Electrochemistry Platform with Real-Time Normality Testing of Voltammetry Measurements Using ML
Al-Najjar, Anees, Rao, Nageswara S. V., Bridges, Craig A., Dai, Sheng, Walters, Alex
Electrochemistry workflows utilize various instruments and computing systems to execute workflows consisting of electrocatalyst synthesis, testing and evaluation tasks. The heterogeneity of the software and hardware of these ecosystems makes it challenging to orchestrate a complete workflow from production to characterization by automating its tasks. We propose an autonomous electrochemistry computing platform for a multi-site ecosystem that provides the services for remote experiment steering, real-time measurement transfer, and AI/ML-driven analytics. We describe the integration of a mobile robot and synthesis workstation into the ecosystem by developing custom hub-networks and software modules to support remote operations over the ecosystem's wireless and wired networks. We describe a workflow task for generating I-V voltammetry measurements using a potentiostat, and a machine learning framework to ensure their normality by detecting abnormal conditions such as disconnected electrodes. We study a number of machine learning methods for the underlying detection problem, including smooth, non-smooth, structural and statistical methods, and their fusers. We present experimental results to illustrate the effectiveness of this platform, and also validate the proposed ML method by deriving its rigorous generalization equations.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.69)
- Information Technology (0.67)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Architecture (1.00)
ORACLE: A Real-Time, Hierarchical, Deep-Learning Photometric Classifier for the LSST
Shah, Ved G., Gagliano, Alex, Malanchev, Konstantin, Narayan, Gautham, Collaboration, The LSST Dark Energy Science
ORACLE is a recurrent neural network with Gated Recurrent Units (GRUs), and has been trained using a custom hierarchical cross-entropy loss function to provide high-confidence classifications along an observationally-driven taxonomy with as little as a single photometric observation. Contextual information for each object, including host galaxy photometric redshift, offset, ellipticity and brightness, is concatenated to the light curve embedding and used to make a final prediction. Training on 0.5M events from the Extended LSST Astronomical Time-Series Classification Challenge, we achieve a top-level (Transient vs Variable) macro-averaged precision of 0.96 using only 1 day of photometric observations after the first detection in addition to contextual information, for each event; this increases to >0.99 once 64 days of the light curve has been obtained, and 0.83 at 1024 days after first detection for 19-way classification (including supernova sub-types, active galactic nuclei, variable stars, microlensing events, and kilonovae). We also compare ORACLE with other state-of-the-art classifiers and report comparable performance for the 19-way classification task, in addition to delivering accurate top-level classifications much earlier. The code and model weights used in this work are publicly available at our associated GitHub repository.
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- (5 more...)
An experimental study on fairness-aware machine learning for credit scoring problem
Thu, Huyen Giang Thi, Doan, Thang Viet, Quy, Tai Le
Digitalization of credit scoring is an essential requirement for financial organizations and commercial banks, especially in the context of digital transformation. Machine learning techniques are commonly used to evaluate customers' creditworthiness. However, the predicted outcomes of machine learning models can be biased toward protected attributes, such as race or gender. Numerous fairness-aware machine learning models and fairness measures have been proposed. Nevertheless, their performance in the context of credit scoring has not been thoroughly investigated. In this paper, we present a comprehensive experimental study of fairness-aware machine learning in credit scoring. The study explores key aspects of credit scoring, including financial datasets, predictive models, and fairness measures. We also provide a detailed evaluation of fairness-aware predictive models and fairness measures on widely used financial datasets.
- Asia > Vietnam (0.05)
- Europe > United Kingdom (0.04)
- North America > United States (0.04)
- (5 more...)